We study in detail the turnout rate statistics for 77 elections in 11 different countries. We
show that the empirical results established in a previous paper for French elections appear to hold
much more generally. We find in particular that the spatial correlation of turnout rates decays
logarithmically with distance in all cases. This result is quantitatively reproduced by a decision
model that assumes that each voter makes up his or her mind as a result of three influence terms: one totally
idiosyncratic component, one city-specific term with short-ranged fluctuations in space, and one
long-ranged correlated field which propagates diffusively in space. A detailed analysis reveals several
interesting features: for example, different countries have different degrees of local heterogeneity
and seem to be characterized by a different propensity for individuals to conform to the cultural
norm. We furthermore find clear signs of herding (i.e. strongly correlated decisions at the individual
level) in some countries, but not in others.
I. INTRODUCTION



Empirical studies and models of election statistics have attracted considerable attention in the recent physics
literature. In [10], the present authors studied the statistical regularities of electoral turnout
rates, based on spatially resolved data from 13 French elections since 1992. Two striking features emerged from our
analysis: first, the distribution of the logarithmic turnout rate $\tau$ (defined precisely below) was found to be remarkably
stable over all elections, up to an election-dependent shift. Second, the spatial correlations of $\tau$ were found to be well
approximated by an affine function of the logarithm of the distance between two cities. Based on these empirical
results, we proposed that the behaviour of individual agents is affected by a space-dependent "cultural field" that
encodes a local bias in the decision-making process (to vote or not to vote), common to all inhabitants of a given
city. The cultural field itself can be decomposed into an idiosyncratic part, with short-range correlations, and a slow,
long-range part that results from the diffusion of opinions and habits from one city to its close-by neighbours. We
showed in particular that this local propagation of cultural biases generates, at equilibrium, the logarithmic decay of
spatial correlations that is observed empirically [10].
The aim of the present note is to provide additional support to these rather strong statements, using a much larger
set of elections from different countries in the world. We discuss in more depth the approximate universality of the
distribution of turnout rates, and show that some systematic effects do in fact exist, related in particular to the size of the
cities. We also confirm that the logarithmic decay of the spatial correlations approximately holds for all countries and
all elections, with parameters compatible with our diffusive field model. The relative importance of the idiosyncratic,
city-dependent contribution and of the slow diffusive part is however found to depend strongly on the country.
We also confirm the universality of the logarithmic turnout rate for different elections, for different regions or for
different cities, provided the mean and the width of the distribution are allowed to depend on the city size. Overall,
our empirical analysis provides further support to the binary logit model of decision making, with a space-dependent
mean (the cultural field mentioned above).



II. DATA AND OBSERVABLES 



We have analyzed the turnout rate at the scale of municipalities for 77 elections, from 11 different countries. For
some countries, the number of different elections is substantial: 22 from France (Fr, ≈ 36000 municipalities in mainland
France) [11], 13 from Austria (At, ≈ 2400 municipalities) [11], 11 from Poland (Pl, ≈ 2500 municipalities) [11], 7
from Germany (Ge, ≈ 12000 municipalities), while for others we have fewer samples: 5 from Canada (Ca, ≈ 7700
municipalities) [15], 4 from Romania (Ro, ≈ 3200 municipalities) [16], 4 from Spain (Sp, ≈ 8000 municipalities in
mainland Spain) [17], 4 from Italy (It, ≈ 7200 municipalities in mainland Italy) [18], 3 from Mexico (Mx, ≈ 2400
municipalities) [19], 3 from Switzerland (CH, ≈ 2700 municipalities) [20] and 1 from the Czech Republic (Cz, ≈ 6200
municipalities) [21]. More details on the nature of these elections and some specific issues are given in the Appendix.

For each municipality and each election, the data files give the total number of registered voters $N$ and the number
of actual voters $N_+$, from which one obtains the usual turnout rate $\pi = N_+/N \in [0, 1]$. For reasons that will become
clear, we will instead consider in the following the logarithmic turnout rate (LTR) $\tau$, defined as:

$$\tau := \ln\left(\frac{\pi}{1-\pi}\right), \qquad \tau \in\, ]-\infty, +\infty[. \qquad (1)$$

Because we know the geographical location of each city, the knowledge of $\tau$ for each city enables us to create a map
of the field $\tau(R)$, where $R$ is the location of the city, and to study its spatial correlations.
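As a minimal sketch of the definition in Eq. (1), the LTR can be computed from the two counts available for each municipality. The function names below are illustrative assumptions, not taken from the original analysis, and the treatment of municipalities with turnout exactly 0 or 1 (for which the LTR diverges) is our choice.

```python
import math


def log_turnout_rate(registered, voters):
    """Return tau = ln(pi / (1 - pi)) with pi = voters / registered (Eq. 1).

    Hypothetical helper: turnout of exactly 0 or 1 is rejected because
    the logarithmic turnout rate is infinite in those cases.
    """
    if registered <= 0:
        raise ValueError("registered must be positive")
    if not 0 < voters < registered:
        raise ValueError("turnout must lie strictly between 0 and 1")
    pi = voters / registered
    return math.log(pi / (1.0 - pi))


def turnout_from_ltr(tau):
    """Inverse map: recover the turnout rate pi from the LTR tau."""
    return 1.0 / (1.0 + math.exp(-tau))
```

A turnout of 50% maps to $\tau = 0$, and the map is symmetric: turnouts of 75% and 25% give $\tau = \pm\ln 3$.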



III. STATISTICS OF THE LOCAL TURNOUT RATE 



Whereas the average turnout rate is quite strongly dependent on the election (both on time and on the type of
election: local, presidential, referendum, etc.), the distribution of the shifted LTR $\tau - \langle\tau\rangle$ was found to be remarkably
similar for the 13 French elections studied in [10]^1. The LTR standard deviation, skewness and kurtosis were found
to be very similar between different elections. The distribution $P(u)$ of the shifted and rescaled LTR,

$$u = \frac{\tau - \langle\tau\rangle}{\sigma}, \qquad \text{with} \quad \sigma^2 = \langle\tau^2\rangle - \langle\tau\rangle^2, \qquad (2)$$

was found to be very close between elections in the Kolmogorov-Smirnov (KS) sense.

We have extended this analysis to the 9 new election data sets in France, and to all new countries mentioned above. For
France, the Elections Municipales (election of the city mayor), not considered in [10], have a distinctly larger standard
deviation than national elections. However, $P(u)$ is again found to be similar for all the French elections, except the
Regionales of 1998 and 2004. These happen to be coupled with other local elections in half of the municipalities, which
clearly introduces a bias. The distributions $P(u)$ for all elections in France are shown in Fig. 1 and compared to a
Gaussian variable. The distribution is clearly non-Gaussian, with a positive skewness $\approx 1.1$ and a kurtosis
$\approx 4.8$. A more precise analysis consists in computing the KS distances between each pair of elections. We recall
here that a KS distance of $d_{KS} = 1$ corresponds to a ≈ 20% probability that the two tested distributions coincide,
while $d_{KS} = 1.6$ corresponds to a ≈ 1% probability. Removing the Regionales, we find that the KS distance $d_{KS}$
averaged over all pairs of elections is equal to 1.49, with a standard deviation of 0.47. These numbers are slightly too
large to ascertain that the distributions are exactly the same, since in that case the average $d_{KS}$ should be equal to
$\sqrt{\pi/2}\,\ln 2 \approx 0.87$. On the other hand, these distances are not large either (as is visually clear from Fig. 1), meaning
that while systematic differences between elections do exist, they are quite small. We will explain below a possible
origin for these differences.
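The comparison described above can be sketched as follows. We assume here that the KS distance quoted in the text is the usual two-sample statistic rescaled by $\sqrt{nm/(n+m)}$, which is the normalization under which identical distributions give an average of $\sqrt{\pi/2}\,\ln 2 \approx 0.87$. The function names are illustrative, and the code is a pure-Python sketch rather than the procedure actually used for the paper.

```python
import math

# Mean of the Kolmogorov distribution: the expected rescaled KS distance
# when both samples are drawn from the same continuous distribution.
KS_MEAN_IF_IDENTICAL = math.sqrt(math.pi / 2.0) * math.log(2.0)


def standardize(values):
    """Shift and rescale a sample to zero mean and unit variance (Eq. 2)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return [(x - mean) / sd for x in values]


def ks_distance(sample_a, sample_b):
    """Rescaled two-sample KS distance sqrt(n*m/(n+m)) * sup|F_a - F_b|."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    sup_gap = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both empirical CDFs past the current value (handles ties).
        while i < n and a[i] == x:
            i += 1
        while j < m and b[j] == x:
            j += 1
        sup_gap = max(sup_gap, abs(i / n - j / m))
    return math.sqrt(n * m / (n + m)) * sup_gap
```

In use, one would standardize the LTR of each election with `standardize` and then call `ks_distance` on every pair of elections, comparing the average to `KS_MEAN_IF_IDENTICAL`.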

The same analysis can be done for all countries separately; as for France, we find that the $P(u)$ for different elections are
all similar, except for Germany, for which $\langle d_{KS}\rangle = 3$ (see Table I, where we show the mean and the standard deviation
of KS distances between elections of a given country, and of the skewness and kurtosis of the distributions $P(u)$ in
a given country). Note that the values of $\langle d_{KS}\rangle$ are close to 0.87 for Italy and Poland. On the other hand, these
distributions are clearly found not to be identical across different countries. Table II shows the matrix of KS distances
between the "super-distributions"^2 of the different countries. The values of $d_{KS}$ are all large, except for the pairs Fr-Cz, Fr-CH, Sp-CH,
Sp-Ro and CH-Cz.

In order to understand these results better, one should first realize that the statistics of the LTR does in fact
strongly depend on the size of the cities. This was already pointed out in [10]. For example, the average LTR for
all cities of size $N$ (within a certain interval), which we denote as $\langle\tau\rangle_N = m_N$, is distinctly $N$ dependent, see Fig. 2. In
most cases, the average turnout rate is large in small cities and declines in larger cities, with notable exceptions: for



^1 The notation $\langle\ldots\rangle$ means a flat average over all cities (i.e. not weighted by the population $N$ of the city).

^2 A "super-distribution" of $\tau$ of a country is obtained by aggregating the appropriately shifted LTR distributions over all "compatible"
elections. Compatible elections have roughly the same distribution $P(\tau - \langle\tau\rangle)$, i.e. without normalization by its standard deviation. They
are chosen as follows: for Canada and Poland all elections; for France all pure national elections (not combined with local elections, i.e.
all elections apart from 1998-rg, 2004-rg, 2001-mun and 2008-mun); for Mexico 2003-D and 2009-D; for Germany 2005-D and 2009-D;
all Chamber of Deputies (D) elections for Austria, Spain, Italy and Switzerland; and for Romania, all elections apart from its European
Parliament election (see Appendix for more details).
Country   d_KS                     skewness                    kurtosis
Fr        1.49±0.47 (1.42±0.45)    1.08±0.15 (1.10±0.14)       4.8±1.3 (5.1±0.9)
At        1.44±0.54 (0.93±0.19)    0.10±0.38 (-0.13±0.21)      0.53±0.81 (0.54±0.43)
Pl        0.80±0.20 (0.80±0.20)    0.12±0.26 (0.12±0.26)       0.38±0.42 (0.38±0.42)
Ge        3.0±1.1 (0.81)           0.48±0.30 (0.20±0.05)       1.6±0.9 (1.53±0.04)
Sp        1.78±0.68 (1.24)         0.27±0.25 (0.07±0.21)       1.8±1.1 (2.5±1.2)
It        0.70±0.09 (0.68)         -0.45±0.11 (-0.45±0.15)     1.01±0.02 (1.01±0.003)
CH        1.67±0.43                0.51±0.08 (0.47)            1.4±1.4 (2.9)
Mx        1.28±0.35 (1.19)         0.32±0.09 (0.35±0.11)       1.1±0.8 (1.6±0.3)
Ca        1.23±0.39 (1.23±0.39)    -0.40±0.39 (-0.40±0.39)     4.4±0.9 (4.4±0.9)
Ro        1.06±0.39 (0.95±0.36)    0.05±0.43 (-0.14±0.25)      1.5±0.4 (1.6±0.4)

TABLE I. Mean and standard deviation of the KS distances $d_{KS}$ between all pairs of elections within each country. The mean and
standard deviation of the skewness and kurtosis of the distributions of $\tau$ over all municipalities are also given for each country. In
parentheses, the same measures restricted to compatible elections in each country.



example, the trend is completely reversed in Poland, with more complicated patterns for parliament elections in Italy
or Germany. Similarly, the standard deviation of $\tau$, $\sigma_N$, also depends quite strongly on $N$ (see Figs. 4 and 5 below).

However, the distribution $Q_N(v)$ of the rescaled variable $v = (\tau - m_N)/\sigma_N$ over all cities of size $N$ for each election
can be considered to be universal from a KS point of view, both within the same country for different $N$ and now
also, when $N$ is large enough, across different countries. For example, the average KS distance between distributions
corresponding to different ranges of $N$ in France is equal to 0.58, with standard deviation 0.12. These numbers are
TABLE II. Kolmogorov-Smirnov distance between different "super-distributions".

FIG. 2. Average value, $m_N$, of the conditional distribution $P(\tau|N)$, for all countries and all elections. These
quantities are obtained as averages over bins with 100 (200 for France) municipalities of size $\approx N$.
N range   1000 < N < 2000   2000 < N < 4000   4000 < N < 8000   8000 < N < 16000   16000 < N
d_KS      1.47±0.77         1.38±0.65         0.94±0.48         0.91±0.46          0.95±0.48

TABLE III. Mean and standard deviation over all pairs of countries of the KS distance $d_{KS}$ between the aggregated $Q_N(v)$
distributions in each country, for different values of $N$.



respectively 0.72 ± 0.20, 0.58 ± 0.13 and 0.87 ± 0.36 for Italy, Spain and Germany^3. In Table III we show, for different
bins of $N$ values, the mean and standard deviation of the KS distance between countries, illustrating that all distributions
are statistically compatible, at least when $N$ is large enough.

Now, even if $Q_N(v)$ is universal and equal to $Q^*(v)$, the distribution $P(\tau)$ will reflect the country-specific (and possibly
election-specific) shapes of $m_N$ and $\sigma_N$, and the country-specific distribution of city sizes, $\rho(N)$. Indeed, one has:

$$P(\tau) = \sum_N \rho(N)\,\frac{1}{\sigma_N}\,Q^*\!\left(\frac{\tau - m_N}{\sigma_N}\right), \qquad (3)$$

which has no reason whatsoever to be universal. But since for a given country the dependence on $N$ of $m_N$, $\sigma_N$ and
$\rho(N)$ tends to change only weakly in time, the approximate universality of $P(u)$ for a given country follows from that
of $Q_N(v)$. In fact, French national elections can be grouped into two families, such that the dependence of $m_N$ on
$N$ is the same within each family but markedly different for the two families (see next section and Fig. 3 below).
Restricting the KS tests to pairs within each family now leads to an average KS distance of $\approx 1.25$ with a standard
deviation $\approx 0.4$ (identical for the two families), substantially smaller than the value of $d_{KS}$ reported in Table I. This goes to show
that the election-specific shape of $m_N$ is indeed partly responsible for the weak non-universality of $P(u)$.
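To illustrate why the mixture in Eq. (3) need not be universal, the following sketch computes the low moments of a mixture over city sizes under the simplifying assumption, made here for illustration only, that $Q^*$ is exactly Gaussian. The bin weights, means and widths passed to the function are hypothetical inputs standing for $\rho(N)$, $m_N$ and $\sigma_N$.

```python
def mixture_shape(weights, means, sds):
    """Mean, variance, skewness and excess kurtosis of a Gaussian mixture.

    Each component k stands for one city-size bin, with weight rho(N_k),
    mean m_{N_k} and standard deviation sigma_{N_k}. Assumes a Gaussian
    Q*, which is an illustrative simplification.
    """
    total = sum(weights)
    w = [x / total for x in weights]
    mean = sum(wk * mk for wk, mk in zip(w, means))
    m2 = m3 = m4 = 0.0
    for wk, mk, sk in zip(w, means, sds):
        d = mk - mean
        # Central moments of a Gaussian shifted by d from the mixture mean.
        m2 += wk * (sk ** 2 + d ** 2)
        m3 += wk * (d ** 3 + 3 * d * sk ** 2)
        m4 += wk * (d ** 4 + 6 * d ** 2 * sk ** 2 + 3 * sk ** 4)
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return mean, m2, skewness, excess_kurtosis
```

With a single bin the result is Gaussian (zero skewness and excess kurtosis), whereas two equally weighted bins with the same mean but widths 1 and 2 already give an excess kurtosis of 1.08: the shape of $P(\tau)$ depends on how $m_N$, $\sigma_N$ and $\rho(N)$ vary, even when $Q^*$ is fixed.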



^3 We have excluded the smallest cities, $N < 200$, which have a distinctly larger KS distance with other cities (see below). Bins, ranked
according to the municipality size $N$, each contain around 500 municipalities.
Country   1000 < N < 2000   2000 < N < 4000   4000 < N < 8000   8000 < N < 16000   16000 < N

2.15
0.48
0.69   0.44
1.09   0.60   0.53   0.59
1.73   1.48   1.14   0.63   0.92

TABLE IV. KS distance between $Q_N(v)$ and a normalized Gaussian for different ranges of $N$ and for different countries.



(a)
Country   d_KS    skew    kurt
Fr        2.55    -0.02   0.31
At        2.63    -0.05   0.15
Pl        2.13    0.18    0.58
Ge        4.09    -0.21   0.05
Sp        1.03    -0.16   0.41
It        5.61    -0.67   0.79
Cz        0.83    -0.32   0.30
Mx        1.21    0.12    -0.06
CH        1.85    0.24    0.88
Ca        2.93    -0.75   2.14
Ro        2.36    -0.06   1.25

(b)
N range             d_KS    skew    kurt
1000 < N < 2000     2.25    -0.07   0.43
2000 < N < 4000     3.50    -0.12   0.44
4000 < N < 8000     2.90    -0.12   0.42
8000 < N < 16000    1.74    -0.13   0.31
16000 < N           1.74    -0.19   0.43

TABLE V. KS distance ($d_{KS}$) to a standardized Gaussian, and low-moment skewness (skew) and kurtosis (kurt) of aggregated
distributions $Q^*(v)$. Tab. V-a: data are aggregated over all $N$ for each country. Tab. V-b: data are aggregated over all countries
for fixed $N$.



Zooming in now on details, we give in Table IV the KS distance between $Q_N(v)$ aggregated over all elections of a
country and a normalized Gaussian, for different ranges of $N$ and different countries. The skewness and kurtosis of
the distribution $Q^*(v)$ and the KS distance to a Gaussian, aggregated over all $N$, are given in Table V-a for different
countries, and aggregated over countries for fixed $N$ in Table V-b. Two features emerge from these Tables:

• While for some countries (Cz, Sp, Mx) the deviation of $Q_N(v)$ from a Gaussian appears small (whether measured
by the KS distance or by the skewness and kurtosis), such an assumption is clearly unacceptable for Italy and Germany, for
which the KS distance is large for all $N$ (see Table IV) and a substantial negative skewness can be measured.
Furthermore, the aggregated distribution (over all $N$) is clearly incompatible with a Gaussian except in the
Czech Republic, Spain and Mexico.

• There is an interesting systematic $N$ dependence of the distance to a Gaussian, which is on average smaller for
larger $N$, and maximum for small cities. This suggests that although the KS test is unable to distinguish
strongly the $Q_N(v)$ for different $N$, there is in fact a systematic evolution, for which we provide an argument
below. In fact, as clearly seen in Table III, the average KS distance between the $Q_N$ of different countries is
also systematically smaller as $N$ increases.



IV. A THEORETICAL CANVAS



In order to delve deeper into the meaning of the above results, we need a theoretical framework. In [10], we proposed
to extend the classical theory of choice to account for spatial heterogeneities. A registered voter $i$ makes the decision
to vote ($S_i = 1$) or not ($S_i = 0$) in a given election. We can view this binary decision as resulting from a continuous
and unbounded variable $\varphi_i \in\, ]-\infty, +\infty[$ that we called intention (or propensity to vote). The final decision depends
on the comparison between $\varphi_i$ and a threshold value $-\Phi_{th}$: $S_i = 1$ when $\varphi_i > -\Phi_{th}$, and $S_i = 0$ otherwise. In [10],
the intention $\varphi_i(t)$ of an agent at time $t$ who lives in a city $\alpha$, located in the vicinity of $R_\alpha$, was decomposed as:

$$\varphi_i(t) = \epsilon_i(t) + \phi(R_\alpha, t) + \mu_\alpha(t), \qquad (4)$$

where $\epsilon_i(t)$ is the instantaneous and idiosyncratic contribution to the intention that is specific to voter $i$, and $\phi(R, t)$
and $\mu_\alpha(t)$ are fields that locally bias the decision of agents living in the same area. The first field $\phi$ is assumed to be
smooth (i.e. slowly varying in time and space), as the result of the local influences of the surroundings. This is what
we called a "cultural field", which transports (in space) and keeps the memory (in time) of the collective intentions. The
second field $\mu_\alpha$, on the other hand, is city- and election-specific, and by assumption has small inter-city correlations.
It reflects all the elements in the intention that depend on the city: its size, the personality of its mayor, the specific
importance of the election that might depend on the socio-economic background of its inhabitants, as well as the
fraction of them who recently settled in the city, etc. (See [10] for a more thorough discussion of Eq. (4).)

Consider now $N$ agents living in the same city, i.e. under the influence of the same field values $\phi$ and $\mu$. The
turnout rate $\pi$ is by definition:

$$\pi = \frac{1}{N}\sum_{i=1}^{N} S_i. \qquad (5)$$

For $N$ sufficiently large, and if the agents make independent decisions, the Central Limit Theorem tells us that:

$$\pi \approx p + \sqrt{\frac{p(1-p)}{N}}\,\xi, \qquad (6)$$

where $p = P(\varphi_i > -\Phi_{th})$ is the probability that the conviction of the voter is strong enough, and $\xi$ is a standardized
Gaussian noise. If, on the other hand, agents make correlated decisions (for example, everybody in a family decides
to vote or not to vote under the influence of a strong leader), one expects the variance of the noise term to increase
by a certain "herding" factor $h \geq 1$, which measures the average size of strongly correlated groups. Therefore we will
write more generally:

$$\pi \approx p + \sqrt{\frac{h\,p(1-p)}{N}}\,\xi. \qquad (7)$$

Following a standard assumption in Choice Theory, we take the idiosyncratic $\epsilon$'s to have a logistic distribution with
zero mean and scale parameter $\Sigma$ (which sets the standard deviation), in which case the expression of $p$ becomes:

$$p = \frac{1}{1 + \exp\left(-\frac{\Phi_{th} + \phi + \mu}{\Sigma}\right)}. \qquad (8)$$

This allows one to obtain a very simple expression for the LTR $\tau$:

$$\tau \approx \beta\,(\Phi_{th} + \phi + \mu) + \sqrt{\frac{h}{N p(1-p)}}\,\xi, \qquad (9)$$

where $\beta = 1/\Sigma$. Therefore, in this model, the statistics of $\tau$ directly reflects that of the cultural and idiosyncratic
fields.
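The chain from the logistic vote probability to the LTR can be checked numerically with a minimal sketch. It assumes the logistic form $p = 1/(1 + e^{-\beta(\Phi_{th} + \phi + \mu)})$ with $\beta = 1/\Sigma$, and obtains the noise amplitude on $\tau$ by linearizing $\tau = \ln(\pi/(1-\pi))$ around $p$: since $d\tau/d\pi = 1/(p(1-p))$, a noise of standard deviation $\sqrt{h\,p(1-p)/N}$ on $\pi$ becomes $\sqrt{h/(N p(1-p))}$ on $\tau$. The function and argument names are illustrative assumptions.

```python
import math


def vote_probability(phi_th, phi, mu, sigma):
    """Logistic probability that the intention exceeds the threshold."""
    return 1.0 / (1.0 + math.exp(-(phi_th + phi + mu) / sigma))


def expected_ltr(phi_th, phi, mu, sigma):
    """Noise-free LTR; equals beta * (phi_th + phi + mu) with beta = 1/sigma."""
    p = vote_probability(phi_th, phi, mu, sigma)
    return math.log(p / (1.0 - p))


def ltr_noise_sd(p, n_voters, herding=1.0):
    """Standard deviation of the binomial noise on the LTR.

    herding is the factor h (h = 1 for independent voters); the result
    follows from the linearization described in the text above.
    """
    return math.sqrt(herding / (n_voters * p * (1.0 - p)))
```

For instance, with $p = 1/2$ and $N = 1000$ the noise on $\tau$ has standard deviation $\sqrt{1/250} \approx 0.063$ for independent voters, and twice that for $h = 4$.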

Let us work out some consequences of the above decomposition, and how they relate to the above empirical findings.
Since the cultural field $\phi$ is by definition not attached to a particular city, it is reasonable to assume that $\phi$ and $\beta$ are
uncorrelated. Without loss of generality, one can furthermore set $\langle\phi\rangle = \langle\mu\rangle = 0$. Therefore:

$$\langle\tau\rangle_N = m_N \approx \langle\beta\rangle_N\,\Phi_{th} + \langle\beta\mu\rangle_N. \qquad (10)$$

Two extreme scenarios can explain the $N$ dependence of $m_N$: one is that the dispersion term $\langle\beta\rangle_N$ is strongly $N$
dependent while the statistics of $\mu$ is $N$ independent; the other is that $\beta$ is essentially constant and reflects an
intrinsic dispersion common to all voters in a population, while the average of the city-dependent field $\mu$ depends
strongly on the size of the city. Of course, all intermediate scenarios are in principle possible too, but the data are
FIG. 3. Shifted $\langle\tau\rangle_N = m_N$ as a function of $N$ for French elections. Three families of elections clearly appear. a)
Top curves: "important" national elections (Presidential, Referendums, Parliament); b) Bottom curves: less important
national elections (European, Regionales); and c) Middle curves: Municipales (see text). Each point comes from the
average over around 200 communes of size $\approx N$.



not precise enough to home in on the precise relative contribution of the two effects. Here, we want to argue that the
dependence of $\mu$ on $N$ is likely to be dominant. Indeed, if the first scenario was correct, one should observe:

$$m_N = \langle\tau\rangle_N \approx \langle\beta\rangle_N\,\Phi_{th}. \qquad (11)$$

The decrease of $m_N$ as a function of $N$ would therefore mean that $\langle\beta\rangle_N$ itself is a decreasing function of $N$ when the
mean LTR is positive. This is a priori reasonable: one expects more heterogeneity (and therefore a larger $\Sigma$, and a
smaller $\beta$) in large cities than in small cities. However, the same model would imply a smaller dependence on $N$ for
low turnout rates, and even an inverted dependence of $m_N$ on $N$ for elections with a very low turnout rate, such that
$\langle\tau\rangle < 0$. This is not observed: quite on the contrary, the $N$ dependence of $m_N$ is compatible with a mere vertical shift for
similar elections, see Fig. 3.

On the other hand, a model where $\beta$ is constant, independent of $N$ and to a first approximation of the election,
leads to:

$$\langle\tau\rangle_N \approx \beta\,[\Phi_{th} + \langle\mu\rangle_N], \qquad (12)$$

which appears to be a good representation of reality. The dependence of $\langle\mu\rangle_N$, the average propensity to vote, on
$N$ could be the result of several intuitive mechanisms: for example, voters in small cities are less likely to be absent
on election day (usually a Sunday in France); the result of an election is sometimes more important in small cities
than in large cities (for example, election of the mayor); the social pressure from the rest of the community is stronger
in small cities. All these effects suggest that the average turnout rate is higher in small cities. In order to explain the
opposite behaviour (as in Poland), or a non-monotonic dependence, as in Italy or Germany for parliament elections,
a systematic dependence of $\beta$ on $N$ might be relevant, although one should probably delve into local idiosyncrasies.

Figure 3 suggests that in France three families of elections clearly appear: a) "important" national elections
(Presidential, Referendums, Parliament), for which $m_N$ shows a change of concavity around $N = 1000$; b) less
important national elections (European, Regionales), for which the average turnout is low and the change of
concavity is absent; and c) Municipales, for which the variation of $m_N$ between small and large cities is the largest (as
can be expected a priori). Note that the difference $\Delta m$ between the mean LTR for small and large cities is markedly
different in the three cases: $\Delta m \approx 0.7$ in case a), $\Delta m \approx 0.95$ in case b), and $\Delta m \approx 1.65$ in case c).

As a first approximation, we thus take $\beta$ to be constant for all cities. The variance of $\tau$ over all cities of
a given size then reads:

$$\sigma_N^2 = \beta^2\left[\langle\phi^2\rangle + \langle\mu^2\rangle_N - \langle\mu\rangle_N^2\right] + \left\langle\frac{h}{N p(1-p)}\right\rangle_N. \qquad (13)$$

We show in Fig. 4 the quantity $\sigma_N^2$ minus the trivial binomial contribution, i.e. the last term of the RHS of the
above equation, as a function of an inverse power of $N$, for French elections. As predicted by the above model, we see that the
$N \to \infty$ limit is clearly positive, $\approx 0.035 \pm 0.05$, and to a good approximation independent of the election, including
the Municipales: although the dependence of $\sigma_N^2$ on $N$ is found to be markedly different (a slower power law in $N$), this quantity still
extrapolates to the same asymptotic value. If one believes that our interpretation of $\phi$ as a persistent cultural field
is correct, there is in fact no reason to expect that $\sigma_\phi^2 = \langle\phi^2\rangle$ should change at all from election to election. The
above result is therefore compatible with the fact that $\beta$ is to a first approximation election independent, as already
suggested by Fig. 3 above. The same results hold for all other countries, although the statistics is not as good as in



FIG. 4. $\sigma_N^2$ minus the binomial contribution $\langle h/(N p(1-p))\rangle_N$ as a function of $N$ for French elections
(legend entries include 2001-mun and 2008-mun). Each point comes from around 300 communes of size $\approx N$. Dashed
line: $\beta^2\sigma_\phi^2 \approx 0.035$ as extracted from the spatial correlations of $\tau$ (cf. Tab. VI). The 1998 and 2004 Regionales
elections are excluded here.



the case of France: the asymptotic value of $\sigma_N^2$ for $N \to \infty$ is only weakly dependent on the election, and $\beta^2\sigma_\phi^2$ is in
the range 0.03 to 0.12 for all countries. Furthermore, the $N$-dependence of $\sigma_N^2$ is found to be roughly compatible with
$N^{-\omega}$ with $\omega < 1$ in all cases.

If $\beta$ is constant, the $N$-dependent contribution to $\sigma_N^2$ must come from the variance of the city-specific contribution
$\mu$. A simple-minded model for the statistics of $\mu$ predicts a variance that should decrease as $N^{-1}$. Indeed, a large city
can be thought of as a patchwork of $n \propto N$ independent small neighbourhoods, each with a specific value of $\mu$. The
effective value of $\mu$ for the whole city has a variance that is easily found to be reduced by a factor $n$, and therefore
$\sigma_\mu^2 \propto N^{-1}$. A weaker dependence of $\sigma_\mu^2$ on $N$ signals the existence of strong inter-neighbourhood correlations (or
strong heterogeneities in the size of neighbourhoods), which lead to a reduction of the effective number of independent
neighbourhoods from $n \propto N$ to $n \propto N^{\omega}$ with $\omega < 1$. These inter-neighbourhood correlations are indeed expected, since
some of the socio-economic and cultural factors affecting the decision of voters are clearly associated with the whole
city. Interestingly, these correlations should be stronger for local elections, which is indeed confirmed by the fact that
$\omega$ is markedly smaller for the Municipales elections in France. We therefore find the interpretation of the anomalous
$N$ dependence of $\sigma_N^2$ as due to the city-specific contribution $\mu$ rather compelling.
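The patchwork argument above can be made concrete with a small sketch. It uses the textbook result for the average of $n$ variables with common variance and a common pairwise correlation $c$, which is an illustrative assumption of ours: this equicorrelated toy model shows how correlations reduce the effective number of independent neighbourhoods, but it does not by itself produce the power law $N^{\omega}$ discussed in the text.

```python
def city_field_variance(neigh_var, n_neigh, corr=0.0):
    """Variance of the mean of n neighbourhood values of mu.

    neigh_var is the variance of a single neighbourhood and corr the
    common pairwise correlation (hypothetical equicorrelated model).
    """
    return neigh_var * ((1.0 - corr) / n_neigh + corr)


def effective_neighbourhoods(n_neigh, corr=0.0):
    """Number of independent neighbourhoods giving the same variance."""
    return n_neigh / (1.0 + (n_neigh - 1) * corr)
```

With independent neighbourhoods (`corr=0.0`) the variance falls as $1/n$, as in the simple-minded model; with `corr=0.1` and $n = 100$ the effective number of neighbourhoods drops from 100 to about 9.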

Let us now turn to the distribution of the rescaled variable $v$. Within the above model, and again assuming that
$\beta$ is constant, one finds that:

$$v = \frac{\tau - m_N}{\sigma_N} \approx \frac{\beta}{\sigma_N}\left(\phi + \mu - \langle\mu\rangle_N\right) + \frac{1}{\sigma_N}\sqrt{\frac{h}{N p(1-p)}}\,\xi. \qquad (14)$$

The last "binomial" term quickly becomes Gaussian as $N$ increases, and is at least four times smaller than the first
two terms when $N > 1000$ (when $h = 1$). Since the cultural field $\phi$ is, according to the model proposed in [10], the
result of averaging random influences over long time scales and large length scales, one expects, from the Central Limit
Theorem, that $\phi$ is close to a Gaussian field as well. However, the statistics of $\mu$ has no reason to be Gaussian for
small cities, for which it reflects local and instantaneous idiosyncrasies, and for which no averaging argument can
be invoked. The "universality" of $Q_N(v)$ across countries is therefore probably only apparent, since there is no reason
to expect that the distribution of $\mu$ is independent of the country. In fact, $Q_N(v)$ in countries like Italy, Germany and
the Czech Republic does exhibit a stronger skewness than in other countries. Still, according to the above discussion,
the contribution of different neighbourhoods to $\mu$ must average out as $N$ increases, and one expects the distribution
of $\mu$ itself to become more and more Gaussian as $N$ increases.

To sum up: the random variable $v$ is the sum of three independent random variables, two of which can be considered
as Gaussian, while the third has a distribution that depends on $N$ and becomes more Gaussian for large $N$, with
a variance that decreases as $N^{-\omega}$. This allows one to rationalize the above empirical findings on the distributions
$Q_N(v)$: these are more and more Gaussian as $N$ increases, and closer to one another for different countries, since the
country-specific contribution $\mu$ becomes smaller (as $N^{-\omega}$) and itself more Gaussian.

It is instructive to compare the relative contribution to the variance of the turnout rates of the cultural field $\phi$
on the one hand, and of the city-specific field $\mu$ on the other. The latter can be obtained by subtracting from the
total variance of the LTR, $\sigma_N^2$, the contribution of the cultural field $\beta^2\sigma_\phi^2$, which is obtained as the extrapolation of
$\sigma_N^2$ to $N \to \infty$ (see Figs. 4 and 5), and the average contribution of the binomial noise, $\langle h/(N p(1-p))\rangle$. The herding
factor $h$ can be estimated using the method introduced in [10], which compares different elections for which the
binomial noises are by definition uncorrelated (see Eq. (10) of Ref. [10]). The ratio $r = \sigma_\mu^2/\sigma_\phi^2$ can be seen as
an objective measure of the heterogeneity of behaviour in a country, i.e. how strongly local idiosyncrasies can depart
from the global trend. Table VI gives the ratio $r$ for all studied countries. Using this measure, we find that the
most heterogeneous countries are Canada and the Czech Republic*, and the most homogeneous ones are Austria,
Switzerland and Romania. Not surprisingly, however, the largest value of $r$ is found for the French Municipales, i.e.
local elections, for which idiosyncratic effects are indeed expected to be large. Note also that the herding ratio is
anomalously high for Romania ($h = 8.5$), and quite substantial for Poland ($h = 4.7$). Finally, it is interesting to notice
that the quantity $\beta\sigma_\phi$ depends only weakly on the country (it varies by a factor 1.7 between France and Italy). Since
the total intention $\varphi$ is only defined up to an arbitrary scale, one can always set $\sigma_\phi = 1$. Therefore, we find that the
idiosyncratic dispersion $1/\beta$ (or the propensity not to conform to the norm encoded by the cultural field) is strongest
in France, Poland and the Czech Republic, and weakest in Italy and Austria.
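The variance decomposition described above amounts to a simple subtraction, sketched below. The function name and the numbers in the usage note are purely illustrative assumptions; the three inputs stand for the total LTR variance, the cultural contribution $\beta^2\sigma_\phi^2$ and the average binomial contribution $\langle h/(N p(1-p))\rangle$.

```python
def city_specific_share(total_var, cultural_var, binomial_var):
    """Return (beta^2 sigma_mu^2, r) from the variance decomposition.

    The city-specific variance is what remains of the total LTR variance
    after removing the cultural and binomial parts; r is its ratio to
    the cultural part. Hypothetical helper, not the authors' code.
    """
    city_var = total_var - cultural_var - binomial_var
    if city_var < 0:
        raise ValueError("components exceed the total variance")
    return city_var, city_var / cultural_var
```

For example, a total variance of 0.15 with a cultural part of 0.05 and a binomial part of 0.01 leaves a city-specific part of 0.09, i.e. a heterogeneity ratio $r = 1.8$.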



V. SPATIAL CORRELATIONS OF TURNOUT RATES 

Another striking empirical finding reported in [10] is the logarithmic dependence of the spatial correlation of
the LTR as a function of distance. The spatial patterns of the local fluctuations of the LTR in European countries are
shown in Fig. 6. One clearly sees the presence of long-ranged correlations. More precisely, for the 13 French elections
studied there, one finds that the spatial correlation of $\tau'(R_\alpha) = \tau(R_\alpha) - m_{N_\alpha}$ (where $R_\alpha$ is the spatial location of the
city and $m_N$ is the average of $\tau$ over cities of similar sizes) decreases as:

$$C(r) = \langle\tau'(R + r)\,\tau'(R)\rangle \approx -C_0 \ln\frac{r}{L}, \qquad (15)$$

where $L$ is of the order of the size of the country. We show in Fig. 7 the average $C(r)$ for all French elections (except
the two Municipales elections) and in Fig. 8 the normalized correlation functions for all elections, separately for each


* Although the ratios for Ca, Mx, Cz and Ge might be overestimated because the data did not allow us to estimate the herding ratio $h$
in these cases.
0.13   0.035   0.03   1   0.035   0.035   0.045

Country   total variance   h      ω     β²σ_φ² (extrapolation)   β²σ_φ² (from C*)   binomial   β²σ_μ²   r
At        0.13             2.9    …     0.09                     0.14               0.025      0.015    0.17
Pl        0.085            4.7    …     0.035                    0.065              0.         0.05     …
Ge        0.15             0.*    1/4   0.05                     0.105              0.01       0.09     1.8
Sp        0.195            …      1/8   0.06                     0.115              0.035      0.10     1.7
It        0.15             …      1/4   0.10                     0.10               0.02       0.03     0.3
CH        0.155            0.6†   1/2   0.065                    0.105              0.015      0.075    0.85
Cz        0.165            NA◊    1/2   0.035                    0.035              0.025      0.105    3.
Ro        0.11             8.5    1/2   0.07                     NA                 0.015      0.025    0.36
Ca        0.2              1♭     1/2   0.03                     NA                 0.015      0.155    5.1
Mx        0.27             0.†    1/2   0.1                      NA                 0.002      0.17     1.7



TABLE VI. Decomposition of the total LTR variance into a cultural field component $\beta^2\sigma_\phi^2$, a city-specific component $\beta^2\sigma_\mu^2$,
and a binomial component, $h/(Np(1-p))$, corrected by a herding coefficient $h \geq 1$. This last term is determined using the
method proposed in [10], which leads to a herding coefficient $h$ given in the second column. †: when the direct fit gives a value
of $h$ less than unity, we enforce $h = 1$. *: the case of Germany seems to be special, maybe due to a large fraction of postal
votes. ◊: the method to determine $h$ requires more than one election, and therefore cannot be applied to the Czech Republic.
In this case, we also set $h = 1$ by default. ▷: missing data prevents us from determining $h$ precisely, so we again set $h = 1$ by
default. The value of the exponent $\omega$ is only indicative, since in some countries the power-law assumption is not warranted, see
Fig. 5. We give two values for $\beta^2\sigma_\phi^2$: one as the asymptotic extrapolation of $\sigma_N^2 - h/(Np(1-p))$ for $N \to \infty$ and the second
from the rescaling coefficient $C^*$, see below and Fig. 9. Both these determinations are only precise to within roughly ±20%.
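
As a worked illustration of the decomposition described in this caption, the short sketch below checks, for three rows, that the three components add up to the total variance and that the last column is the city-specific component divided by the cultural field component. The assignment of the numbers to columns (total, field, binomial, city, ratio) is an assumption made here for illustration, and `ROWS` and `check_row` are hypothetical names.

```python
# Entries read from three rows of Table VI.
# ASSUMPTION: the column order (total, field, binomial, city, ratio) is inferred, not stated.
ROWS = {
    #      total   field   binomial  city    ratio
    "Ge": (0.15,   0.05,   0.01,     0.09,   1.8),
    "It": (0.15,   0.10,   0.02,     0.03,   0.3),
    "Cz": (0.165,  0.035,  0.025,    0.105,  3.0),
}


def check_row(total, field, binomial, city, ratio, tol=0.05):
    """Return (sum_ok, ratio_ok) for one row, using a relative tolerance `tol`."""
    sum_ok = abs(field + binomial + city - total) <= tol * total
    ratio_ok = abs(city / field - ratio) <= tol * ratio
    return sum_ok, ratio_ok


for country, row in ROWS.items():
    print(country, check_row(*row))
```

The tolerance of 5% is arbitrary and much smaller than the ±20% precision quoted for the field component.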




FIG. 6. Heat map of the normalized logarithmic turnout rate $(\tau - m_N)/\sigma_N$ for the 2004 European Parliament election in France,
Germany, Italy, Poland and Spain. Germany underwent a reform of the nomenclature of its municipalities, which makes it more difficult to
join spatial data to electoral data efficiently. Note the strongly heterogeneous, but long-range correlated nature of the pattern.
Note also some strong regional effects, for example in the German regions of Saarland or Baden-Württemberg, where the average turnout
rate is high and falls sharply across the region boundaries. In these cases, the implicit assumption of a translation invariant
statistical pattern that we make to compute $C(r)$ is probably not warranted, and it would in fact be better to treat these
regions independently.

country for which the geographic position of cities is available to us. 
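
The estimate of $C(r)$ and the logarithmic fit of Eq. (15) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the binning by distance and the least-squares fit are assumptions, and the city positions are assumed to be planar coordinates (e.g. in km).

```python
import numpy as np


def binned_correlation(xy, tau_prime, bin_edges):
    """Estimate C(r) = <tau'(R + r) tau'(R)> by averaging the products
    tau'_i * tau'_j over all distinct city pairs, binned by their distance."""
    xy = np.asarray(xy, dtype=float)
    t = np.asarray(tau_prime, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    i, j = np.triu_indices(len(t), k=1)        # all distinct pairs: O(n^2) memory
    dist = np.hypot(*(xy[i] - xy[j]).T)        # Euclidean distance of each pair
    prod = t[i] * t[j]
    which = np.digitize(dist, edges) - 1       # bin index of each pair
    n_bins = len(edges) - 1
    ok = (which >= 0) & (which < n_bins)       # drop pairs outside the bins
    sums = np.bincount(which[ok], weights=prod[ok], minlength=n_bins)
    counts = np.bincount(which[ok], minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = sums / counts                   # NaN for empty bins
    centers = np.sqrt(edges[:-1] * edges[1:])  # geometric bin centers
    return centers, corr


def fit_log_decay(r, corr):
    """Least-squares fit of C(r) = -C0 * ln(r / L); returns (C0, L)."""
    r = np.asarray(r, dtype=float)
    corr = np.asarray(corr, dtype=float)
    good = np.isfinite(corr)
    slope, intercept = np.polyfit(np.log(r[good]), corr[good], 1)
    c0 = -slope                                # C = -C0 ln r + C0 ln L
    return c0, np.exp(intercept / c0)
```

The all-pairs construction is only practical for a few thousand cities; for a country with tens of thousands of municipalities one would subsample pairs or use a spatial index.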

Using the above decomposition, and noting that by assumption the fluctuations of $\mu$ around the suitable size-dependent
average $\langle\mu\rangle_N$ have short-ranged correlations, one concludes that the long-range, logarithmic correlations
above must come from those of the cultural field $\phi$. One indeed finds:

$$C(r \neq 0) \approx \langle \phi(R + r)\,\phi(R) \rangle, \qquad (16)$$

since the other two terms only contribute for $r = 0$. As a consistency check of this decomposition, one should find
that $C(r)$ quickly decays from $C(r = 0)$ to $C(r \to 0^+) \approx \beta^2\sigma_\phi^2$ (e.g. $\approx 0.035 \pm 0.05$ for France). This is indeed
seen to be well borne out, see Fig. 7. The agreement between two completely different determinations of $\beta^2\sigma_\phi^2$ (one
FIG. 7. Average of spatial correlations $C(r)$ for all French elections (excluding the 2 Municipales elections). In dashed lines:
$\beta^2\sigma_\phi^2 \approx 0.035$, as extracted from the asymptotic ($N \to \infty$) dependence of $\sigma_N^2$.




FIG. 8. Normalized spatial correlations $C(r)$ of $\tau' = \tau - m_N$ for all countries for which the geographic position of cities is
available. The correlation is normalized by the variance of r', such that C(r = 0) = 1. For labels of elections, see Figs. [Tl[2l 



using the extrapolation of $\sigma_N^2$ to infinite sizes, and the second using $C(r)$) holds very well for France, Italy and the
Czech Republic, and only approximately for other countries (see Tab. VI and Fig. 5).

Inspired by a well-known model in statistical physics where these logarithmic correlations appear, we postulated in
[10] that the field $\phi$ evolves according to a diffusion equation, driven by a random noise, which is meant to describe
the exchange of ideas and opinions between nearby cities and the random nature of the shocks that may affect the
cultural substrate. As we argued in [10], the fact that people move around and carry with them some components
of the local cultural specificity leads to a local propagation of $\phi(R_\alpha, t)$. Through human interactions, the cultural
differences between nearby cities tend to narrow according to:



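$$\frac{\partial \phi(R_\alpha, t)}{\partial t} = \sum_\beta \Gamma_{\alpha\beta}(r_{\alpha\beta}) \left[\phi(R_\beta, t) - \phi(R_\alpha, t)\right], \qquad (17)$$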



where $\Gamma_{\alpha\beta}(r_{\alpha\beta}) > 0$ is a symmetric influence matrix, that we assume to decrease over a distance corresponding to
regular displacements of individuals, say 10 km or so. For concreteness, we take: $\Gamma_{\alpha\beta}(r) = \Gamma_0 e^{-r/\ell_c}$. As is well
FIG. 9. Average of spatial correlation, rescaled. Left: Average over numerical simulations of the model (with $\ell_c = 4.5$ km)
with the true positions of all cities for each country. Right: Average over real election data for each country. We also
show the average and standard deviation (coming from different realizations of the noise history $\eta$, and plotted as error bars)
corresponding to the numerical model for French cities.



known, the continuum limit of the right hand side of Eq. (17) reads $D \Delta\phi(R, t)$, where $\Delta$ is the Laplacian and
$D(R_\alpha) \propto \sum_\beta r_{\alpha\beta}^2\,\Gamma_{\alpha\beta}$ is a measure of the speed at which the cultural field diffuses. Random cultural "shocks" add
to the above equation a noise term $\eta(R_\alpha, t)$.
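
A minimal numerical sketch of Eq. (17) with the added noise term is given below. It assumes an explicit Euler-Maruyama time step, a dense influence matrix and an uncorrelated Gaussian noise of amplitude `noise_amp`; the function name, its parameters and all default values other than `ell_c = 4.5` (km) are illustrative assumptions, not the authors' simulation code.

```python
import numpy as np


def simulate_cultural_field(xy, ell_c=4.5, gamma0=1.0, noise_amp=1.0,
                            dt=0.01, n_steps=1000, phi0=None, seed=0):
    """Integrate d(phi_a)/dt = sum_b Gamma_ab (phi_b - phi_a) + eta_a(t)
    with Gamma_ab = gamma0 * exp(-r_ab / ell_c) on arbitrary city positions."""
    rng = np.random.default_rng(seed)
    xy = np.asarray(xy, dtype=float)
    n = len(xy)
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # dense n x n distance matrix
    gamma = gamma0 * np.exp(-dist / ell_c)     # symmetric influence matrix
    np.fill_diagonal(gamma, 0.0)               # no self-influence
    out_rate = gamma.sum(axis=1)               # sum_b Gamma_ab for each city a
    phi = np.zeros(n) if phi0 is None else np.array(phi0, dtype=float)
    for _ in range(n_steps):
        drift = gamma @ phi - out_rate * phi   # sum_b Gamma_ab (phi_b - phi_a)
        noise = noise_amp * np.sqrt(dt) * rng.standard_normal(n)
        phi = phi + dt * drift + noise         # explicit step: needs dt * max(out_rate) << 1
    return phi
```

With the noise switched on, the spatial mean of the field performs a random walk, so one would subtract it before computing correlations, and average over several seeds to mimic different noise histories. The dense matrix limits this sketch to a few thousand cities.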

If cities were located on the nodes of a regular lattice of linear size $L$, it would be easy to compute analytically the
stationary correlation function of the field $\phi$. It is found to be given by a logarithmic function of distance, provided
$L \gg \ell_c$:



(18) 
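
As a reminder of where such a logarithm comes from, here is a minimal sketch of the standard continuum calculation in two dimensions. It assumes a uniform diffusion constant $D$ and a white noise of variance $A^2$ (a symbol introduced here for illustration only); it is the textbook result for a noisy diffusion equation and is not claimed to reproduce the exact prefactor of Eq. (18).

```latex
\begin{gather*}
\partial_t \phi(R,t) = D\,\Delta\phi(R,t) + \eta(R,t), \qquad
\langle \eta(R,t)\,\eta(R',t') \rangle = A^2\,\delta^2(R-R')\,\delta(t-t'), \\
\partial_t \phi_k = -D k^2 \phi_k + \eta_k
\;\Longrightarrow\;
\langle |\phi_k|^2 \rangle_{\mathrm{stat}} = \frac{A^2}{2 D k^2}, \\
C_\phi(r) = \int \frac{d^2k}{(2\pi)^2}\,\frac{A^2}{2 D k^2}\,e^{i k \cdot r}
\;\approx\; \frac{A^2}{4\pi D}\,\ln\frac{L}{r}
\qquad (\ell_c \ll r \ll L).
\end{gather*}
```

Each Fourier mode relaxes at rate $D k^2$, so its stationary variance scales as $1/k^2$; in two dimensions the integral of $1/k^2$ between the cut-offs $1/L$ and $1/r$ gives the logarithm.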



However, the spatial distribution of cities in real countries is quite strongly heterogeneous, which leads to significant
deviations from a pure logarithmic decay. In order to compare quantitatively our model with empirical data, we have
therefore simulated the model using Eq. (17) with the exact locations of all cities for the different countries under
consideration. The results, averaged over many histories of the noise term, are shown in Fig. 9-left for $\ell_c = 4.5$ km
(but changing $\ell_c$ from 1.5 km to 9 km hardly changes the curves). Quite remarkably, we see that $C_\phi(r)$ exhibits a
significant concavity, very similar to what is observed for the empirical correlations. In order to see that the model
is indeed compatible with observations, we have plotted in Fig. 9-right the empirical data superimposed with the
prediction of the model for the French case (for which the data is best). The empirical correlation $C(r)$ is rescaled by
a country dependent value $C^*$ in order to achieve the best rescaling. This value of $C^*$ allows us to obtain a second
determination of $\beta^2\sigma_\phi^2$, through the relation:



2„2 



fl2„2 I 



c* 



(19) 



Note however that the numerical model predicts a rather large dispersion around the average result, that comes from
a strong dependence on the noise realisation $\eta(R_\alpha, t)$. One should therefore expect that the empirical data (which
corresponds to only a few histories) departs from the average theoretical curve, in a way perfectly compatible with
Fig. 9-right. This also means that there is quite a bit of leeway in the determination of $C^*$, which is only determined to
within ±20%. Finally, note that the shape of $C(r)$ for Germany is significantly different, with a pronounced change
of regime around $r \approx 70$ km. This is clearly related to the strong regional idiosyncrasies that we discussed in Fig. 5.

We conclude that our numerical model reproduces very satisfactorily the observations for all studied countries
(with the possible exception of Germany, for the reason noted above). This lends strong support to the existence,
conjectured in [10], of an underlying diffusive cultural field responsible for both the long-range correlation (in space)
and persistence (in time) of voting habits.
VI. CONCLUSION

In this paper, we have shown that the empirical results for the statistics of turnout rates established in [10] for
some French elections appear to hold much more generally. We believe that the most striking result is the logarithmic
dependence of the spatial correlations of these turnout rates. This result is quantitatively reproduced by a decision
model that assumes that each voter makes up his or her mind as a result of three influence terms: one totally idiosyncratic
component, one city-specific term with short-ranged fluctuations in space, and one long-ranged correlated field which
propagates diffusively in space. The sum of these three contributions is what we call the "intention". A detailed
analysis of our data sets has revealed several interesting (and sometimes unexpected) features: a) the city-specific
term has a variance that depends on the size $N$ of the city as $N^{-\omega}$ with $\omega < 1$, suggesting strong inter-city correlations;
b) different countries have different degrees of local heterogeneities, defined as the ratio of the variance of the
city-dependent term over the variance of the cultural field; c) different countries seem to be characterized by a different
propensity for individuals to conform to a cultural norm; d) there are clear signs of herding (i.e. strongly correlated
decisions at the individual level) in some countries, but not in others; e) the statistics of the logarithmic turnout rates
become more and more Gaussian as $N$ increases.

Although we have confirmed the existence of a diffusive cultural field using election data from different countries, we 
feel that more work should be done to establish the general relevance of this idea to other decision making processes. 
It would be extremely interesting to find other data sets that would enable one to study the spatial correlations of 
decision making. An obvious candidate would be consumer habits - for example the consumption pattern of some 
generic goods, or the success of some movie, etc. 

Finally, we believe that our detailed analysis of the statistics of turnout rates (or more generally of election results)
reveals both stable patterns and subtle features that could be used to test for possible data manipulation or fraud,
or to define interesting "democracy" indexes. In that respect, the existence of strong herding effects in some countries
is somewhat disturbing.

VII. MATERIALS AND METHODS 

The Appendix gives more information about the set of (public) electoral data studied in this paper. Most of them
can be directly downloaded from official websites (see References).

Average values and standard deviations do not take into account extreme values, in order to remove some electoral
errors, etc.: values that lie more than 5 sigma away from the mean are discarded.
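
The 5-sigma rule can be sketched as follows, for the LTR values of a set of municipalities of similar size. This is a minimal single-pass illustration; the function name `trimmed_mean_std` and its return convention are assumptions.

```python
import numpy as np


def trimmed_mean_std(tau, n_sigma=5.0):
    """Compute the mean and standard deviation of `tau` after discarding the
    values that lie n_sigma or more raw standard deviations from the raw mean."""
    tau = np.asarray(tau, dtype=float)
    mean0, std0 = tau.mean(), tau.std()             # first pass: all values
    keep = np.abs(tau - mean0) < n_sigma * std0     # strict inequality
    return tau[keep].mean(), tau[keep].std(), keep  # second pass: kept values
```

Because the outlier itself inflates the first-pass standard deviation, a single extreme value can only be rejected when the sample is large enough: with a population standard deviation, one value among $n$ can be at most $(n-1)/\sqrt{n}$ standard deviations from the mean, which exceeds 5 only for $n \geq 27$.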

APPENDIX: DETAILS ON THE DATA SOURCES 

Table VII shows the nature of the 77 national elections from 11 countries, studied at the municipality scale.
The countries are: France (Fr), Austria (At), Poland (Pl), Germany (Ge), Spain (Sp), Italy (It), Switzerland (CH), Czech
Republic (Cz), Canada (Ca), Romania (Ro) and Mexico (Mx). Note that all the studied elections took place at the
same time over the whole country (apart from 2 Länder elections in Germany) and are free of compulsory voting. Lastly,
in our database for Germany, postal votes (Briefwahlen) are taken into account in some Länder, not in others, which
artificially increases turnout heterogeneity between German regions.

Moreover, election turnout statistics have been located, identified and geocoded, based on a set of points, which
were obtained by calculating the center of gravity of each municipality or the position of the town hall, and then adding
the X and Y coordinates for each of these features. In addition to these coordinates, the objects are described by
several attributes: logarithmic turnout rate, $\tau$, normalized logarithmic turnout rate, $\nu$, etc. This concerns 8 countries
amongst the 11 previous onespl: Austria [l^l, Czech Republic France [1^, Germany [2^, Italy [13 , Poland .2^ . 
Spain [2^ and Switzerland [3C|. This study is limited to mainland municipalities (and each considered country have 
more than two thousand municipalities). Lambert II étendu is used for France, while the WGS 84 coordinate system is
used for the other countries.



For instance, consider 100 municipalities of size $N$ (as in Fig. 2), each with an LTR $\tau_i$ ($i = 1, 2, \dots, 100$). First, $\langle\tau\rangle$ and $\sigma$ are the average
value and the standard deviation of $\tau$ over these 100 municipalities. Next, the final average value $m_N$ and the final standard deviation
$\sigma_N$ over this sample of 100 municipalities are evaluated only over the municipalities $i$ such that $|\tau_i - \langle\tau\rangle| < 5\sigma$.

France: the 1998 and 2004 Régionales elections took place at the same time as strictly local elections (cantonales, i.e. at a kind of county level) in
half of the municipalities.

Austria: postal votes (Wahlkarten) are not taken into account in this paper.

Germany: Land Parliament elections held up to 2004 (or 2010) in each Land are written here as '2004-Ld' (or '2010-Ld').

Switzerland: the referendums or votations (R(a) and R(b)) occurred on March 11th and July 17th 2007, respectively.

Romania: the referendum studied here (on a unicameral Parliament and a reduction of the maximum number of deputies) took place at the same
time as the first round of the Presidential election. Some Romanian electors, not registered in the lista electorala permanenta, are
able to vote. For this country, we still write $N$ for the number of registered voters, the registered electors who take part in the
election.

Mexico: the spatial distribution of municipalities is so heterogeneous that the spatial analysis carried out for the other countries is no longer
effective here.
Ctry  #el  mun    spa  elections
Fr    22   36000  Y    1992-R, 1993-D, 1994-E, 1995-P1, 1995-P2, 1997-D, 1998-rg, 1999-E, 2000-R, 2001-mun, 2002-P1,
                       2002-P2, 2002-D, 2004-rg, 2004-E, 2005-R, 2007-P1, 2007-P2, 2007-D, 2008-mun, 2009-E, 2010-rg
At    13   2400   Y    1994-D, 1995-D, 1996-E, 1998-P, 1999-E, 1999-D, 2002-D, 2004-P, 2004-E, 2006-D, 2008-D, 2009-E, 2010-P
Pl    11   2500   Y    2000-P1, 2001-D, 2003-R, 2004-E, 2005-D, 2005-P1, 2005-P2, 2007-D, 2009-E, 2010-P1, 2010-P2
Ge    7    12000  Y    2002-D, 2004-Ld, 2005-D, 2009-E, 2009-D, 2010-Ld
Sp    4    8000   Y    2004-D, 2004-E, 2008-D, 2009-E
It    4    7200   Y    2004-E, 2006-D, 2008-D, 2009-E
CH    3    2700   Y    2007-R(1), 2007-R(2), 2007-D
Cz    1    6200   Y    2003-R
Ca    5    7700   N    1997-D, 2000-D, 2004-D, 2006-D, 2008-D
Ro    4    3200   N    2009-E, 2009-R, 2009-P1, 2009-P2
Mx    3    2400   N    2003-D, 2006-D, 2009-D



TABLE VII. Nature of elections studied in this paper. For each country (Ctry), the number of elections (#el) and the number
of municipalities (mun) in the mainland are given. "Y" (or conversely "N") indicates whether municipalities are spatially (spa)
localized. For each country, an election is identified by its year and its nature. D: Chamber of Deputies election; E:
European Parliament election; P: presidential election (according to the constitution of the country, in only one round); P1 and
P2: first and second round of a presidential election; R: referendum; Ld: German Länder elections; rg: French Régionales
elections; mun: French municipales. For each country, elections are given in chronological order (but the 2009 Romanian
Presidential (P) and Referendum (R) elections occurred the same day). Even if an election needs two rounds, only the first one
is considered (e.g. the French Chamber of Deputies (D), Régionales (rg) and municipales (mun) elections) unless the contrary
is indicated (e.g. P1 and P2).